Improving lookup and query execution performance in distributed Big Data systems using Cuckoo Filter
Authors
Abstract
Performance is a critical concern when reading and writing data from billions of records stored in a Big Data warehouse. We introduce two scopes for query performance improvement. One is to improve the performance of lookup queries after data deletion in Big Data systems that use Eventual Consistency; we propose a scheme for this using a Cuckoo Filter. The other scope is to avoid unnecessary network round-trips when querying remote nodes in a distributed Big Data cluster that are known not to have the requested partition of data. We propose a scheme in which probabilistic filters are looked up before querying remote nodes, so that queries resulting in no data can be skipped instead of passing through the network. We evaluate our schemes with Cassandra using a real dataset and show that each scheme can improve lookup query performance by up to 2x.
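The two ideas in the abstract can be illustrated with a toy example. The following is a minimal sketch, not the authors' implementation and not Cassandra code: `CuckooFilter` and `nodes_to_query` are hypothetical names, the bucket count, fingerprint width, and hash choice are assumptions, and the abstract does not say how the filters are maintained or shipped between nodes. It shows only why a Cuckoo Filter suits the deletion case (unlike a plain Bloom filter, it supports removing a key) and how a per-node filter can rule out remote nodes before any network call.

```python
import hashlib
import random


class CuckooFilter:
    """Toy cuckoo filter: each key's fingerprint lives in one of two buckets."""

    def __init__(self, num_buckets=1024, bucket_size=4, max_kicks=500, seed=0):
        # A power-of-two bucket count keeps the XOR-based alternate index in range.
        assert num_buckets & (num_buckets - 1) == 0
        self.num_buckets = num_buckets
        self.bucket_size = bucket_size
        self.max_kicks = max_kicks
        self.buckets = [[] for _ in range(num_buckets)]
        self._rng = random.Random(seed)

    @staticmethod
    def _hash(data: bytes) -> int:
        return int.from_bytes(hashlib.sha256(data).digest()[:8], "big")

    def _fingerprint(self, key: str) -> int:
        # Non-zero fingerprint of at most 16 bits (assumed width).
        return (self._hash(b"fp:" + key.encode()) % 0xFFFF) + 1

    def _index1(self, key: str) -> int:
        return self._hash(key.encode()) % self.num_buckets

    def _alt_index(self, index: int, fp: int) -> int:
        # XOR with the fingerprint's hash; applying it twice returns the original index.
        return (index ^ self._hash(fp.to_bytes(4, "big"))) % self.num_buckets

    def insert(self, key: str) -> bool:
        fp = self._fingerprint(key)
        i1 = self._index1(key)
        i2 = self._alt_index(i1, fp)
        for i in (i1, i2):
            if len(self.buckets[i]) < self.bucket_size:
                self.buckets[i].append(fp)
                return True
        # Both buckets are full: evict a resident fingerprint and relocate it.
        i = self._rng.choice((i1, i2))
        for _ in range(self.max_kicks):
            slot = self._rng.randrange(len(self.buckets[i]))
            fp, self.buckets[i][slot] = self.buckets[i][slot], fp
            i = self._alt_index(i, fp)
            if len(self.buckets[i]) < self.bucket_size:
                self.buckets[i].append(fp)
                return True
        # Filter is considered full; in this toy version the last evicted
        # fingerprint is dropped.
        return False

    def contains(self, key: str) -> bool:
        # May return a false positive, never a false negative for stored keys.
        fp = self._fingerprint(key)
        i1 = self._index1(key)
        return fp in self.buckets[i1] or fp in self.buckets[self._alt_index(i1, fp)]

    def delete(self, key: str) -> bool:
        # Only safe for keys that were actually inserted.
        fp = self._fingerprint(key)
        i1 = self._index1(key)
        for i in (i1, self._alt_index(i1, fp)):
            if fp in self.buckets[i]:
                self.buckets[i].remove(fp)
                return True
        return False


def nodes_to_query(partition_key, node_filters):
    """Return only the nodes whose filter says the partition may be present."""
    return [node for node, flt in node_filters.items() if flt.contains(partition_key)]


if __name__ == "__main__":
    filters = {"node-a": CuckooFilter(), "node-b": CuckooFilter()}
    filters["node-a"].insert("user:42")
    print(nodes_to_query("user:42", filters))   # node-b is skipped
    filters["node-a"].delete("user:42")
    print(nodes_to_query("user:42", filters))   # no node needs to be contacted
```

Because the filter is probabilistic, a positive answer still has to be confirmed by the actual read; only a negative answer allows the lookup or the network round-trip to be skipped.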
Related resources
Distributed query execution system for Transactional Database using Lookup Table
As data volumes are incrementing rigorously, it is essential to store such large amount of data distributed across many machines. In OLTP databases, the most common strategy for scaling database workload is to horizontally partition the database using hash or range partitioning. It works well in many simple applications such as an email application. Transactions that access few tuples do not ru...
A High Performance Parallel IP Lookup Technique Using Distributed Memory Organization and ISCB-Tree Data Structure
The IP Lookup Process is a key bottleneck in routing due to the increase in routing table size, increasing traffic and migration to IPv6 addresses. The IP address lookup involves computation of the Longest Prefix Matching (LPM), which existing solutions such as BSD Radix Tries, scale poorly when traffic in the router increases or when employed for IPv6 address lookups. In this paper, we describe a ...
Data Warehousing and OLAP: Improving Query Performance Using Distributed Computing
Data warehouses are used to store large amounts of data. This data is often used for On-Line Analytical Processing (OLAP) where short response times are essential for on-line decision support. One of the most important requirements of a data warehouse server is the query performance. The principal aspect from the user perspective is how quickly the server processes a given query: “the data ware...
Improving Query Processing Performance in Large Distributed Database Management Systems
The dream of computing power as readily available as the electricity in a wall socket is coming closer to reality with the arrival of grid and cloud computing. At the same time, databases grow to sizes beyond what can be efficiently managed by single server systems. There is a need for efficient distributed database management systems (DBMSs). Current distributed DBMSs are not built to scale to...
Journal
Journal title: Journal of Big Data
Year: 2022
ISSN: 2196-1115
DOI: https://doi.org/10.1186/s40537-022-00563-w